-
- Exception Reports Explained
-
- Quarterdeck Technical Note #142          Filename: EXCEPT13.TEC
- by Michael Bolton                        CompuServe: EXCEPT.TEC
- Last revised: 3/29/95                    Category: QEMM
-
- Subject: A detailed explanation of what QEMM's Exception #6, #12,
- and #13 messages mean, why they are reported, and some of
- the steps that can be taken to identify their causes.
- See also TROUBLE.TEC for general troubleshooting
- suggestions.
-
- Q. What are processor exceptions? What is an Exception #6, #12,
- or #13?
-
- Q. What does the QEMM Exception message mean? How can it help me?
-
- Users of QEMM may sometimes encounter a report that an attempt has
- been made to execute an invalid instruction. It is almost certain
- that QEMM, in and of itself, is not the cause of Exception
- problems, though QEMM's memory management may come into conflict
- with other hardware and software on your system.
-
- In this technical note, we explain in detail what a processor
- exception is, how you can interpret the information provided by
- the exception report, and what you can do to remedy the situation
- in the unhappy event that the techniques in TROUBLE.TEC don't
- provide relief from the problem.
-
- To answer the questions above, it's worthwhile to examine the
- Exception report bit by bit.
-
- "The processor has notified QEMM that an attempt has been made to
- execute an invalid instruction..."
-
- Exceptions are the processor's response to unusual, invalid, or
- special conditions in the normal operation of the 80386 processor
- and others in its family. (The 80386 family includes the 80386SX,
- the 80386DX, the 80486SX, and the 80486DX processors; their memory
- management architecture is essentially the same. In this
- document, the term "386" refers to any and all of these
- processors.) Exceptions cause the 386 processor to stop what it's
- doing and to try to react to the condition that caused the
- exception. QEMM is designed to capture some of these exceptions
- -- particularly those caused by protection faults or invalid
- instructions, which could cause a program or the entire system to
- crash -- and display a report to the user. When the processor
- encounters an instruction that it does not want to execute, it
- passes control to QEMM. QEMM's protected mode INT 6, INT 12, or
- INT 13 handler posts the Exception message. Neither DOS nor
- Microsoft's EMM386.EXE has protected mode handlers as
- sophisticated, so if an exception occurs using only DOS or
- EMM386.EXE, your system may simply crash and leave you without a
- report.
-
-
- Q. What causes an Exception?
-
- "...This may be due to an error in one of your programs, a
- conflict between two pieces of software, or a conflict between
- a piece of hardware and a piece of software...."
-
- The exception reported is most commonly #13, the General
- Protection Fault exception. This indicates that a program has
- tried to execute an invalid or privileged instruction. On the 386
- processor, programs can run at varying privilege levels, so that
- the processor can better protect application programs (which
- generally run at lower privilege levels) from crashing the
- operating system or control program (which typically runs at the
- highest privilege level). DOS and QEMM do not enforce this
- protection, but QEMM can report when a program running at the
- lowest privilege level tries to execute a privileged instruction.
- The result may be a system crash, but QEMM does provide a report
- before the crash happens.
-
- Invalid instructions are harder to classify, for indeed Exception
- #13 is something of a catch-all. Some examples of invalid
- instructions include:
-
- - 386-specific instructions that are disallowed when the processor
- is in virtual 8086 mode. The processor is in this mode whenever
- QEMM is in an ON state -- essentially when it is providing
- expanded memory or High RAM.
-
- - A program trying to write data to a segment that has been marked
- as executable or read-only (the data could overwrite program
- code).
-
- - Trying to run program code from a data segment (if data is read
- as code, it will be a series of meaningless or nonsensical
- instructions -- which, if executed, could jump to invalid
- addresses or overwrite the operating system)
-
- - Exceeding the limit of a segment. In virtual 8086 mode, an
- offset within a segment may not exceed FFFFh (65535 decimal) or
- fall below 0. Neither a program instruction nor a memory
- reference may span the boundary of a segment.
-
- It is this last error which is the most common; this is a problem
- also known as "segment wrap", which we will discuss later. Again,
- QEMM is designed to trap and report these errors, but it cannot
- defend against the system crashes that they may cause.
-
- Occasionally Exception #12, indicating a stack exception, will be
- reported. This is a protection violation very similar to Exception
- #13, but is one in which the stack segment is involved in some
- way. Although generally no easier to solve, it is a somewhat less
- general report than Exception #13.
-
- Exception #6 may also be reported. This indicates that a program
- has tried to execute an invalid opcode. Machine instructions are
- stored as sequences of bytes in memory. These sequences are
- fetched from memory and decoded by the processor into
- machine-language instructions. When the processor encounters a
- sequence of bytes for which there is no corresponding
- machine-language instruction, the processor generates an Exception
- #6 and QEMM reports the Exception to you.
-
- Very infrequently, an Exception #0 is reported. This is not
- intentional; it is usually the result of QEMM's stack being
- corrupted while QEMM was trying to report another exception, or is
- the result of some other system error.
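-
- The exception numbers discussed above can be kept in a small
- lookup table. The following is only a sketch: the names are the
- ones Intel uses for these processor vectors (they do not appear
- in the QEMM report itself), and describe_exception is a
- hypothetical helper, not part of QEMM or Manifest.
-

```python
# Exception vectors covered in this note, with Intel's names for them.
# Assumption: only these four vectors matter for reading a QEMM report.
EXCEPTION_NAMES = {
    0: "Divide Error",
    6: "Invalid Opcode",
    12: "Stack Fault",
    13: "General Protection Fault",
}


def describe_exception(vector: int) -> str:
    """Return a short label for an exception number from a report."""
    name = EXCEPTION_NAMES.get(vector, "not covered in this note")
    return f"Exception #{vector}: {name}"


if __name__ == "__main__":
    for vector in (0, 6, 12, 13):
        print(describe_exception(vector))
```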
-
- It is important to remember that in the vast majority of cases,
- QEMM is not involved with the problem, but is merely reporting it.
- Most often, the problem is simply a bug in the offending program.
-
- Q. What do I do now?
-
- "...It is likely that the system is unstable now and should be
- rebooted...."
-
- QEMM is designed to offer the user the opportunity to terminate
- the offending program, or to reboot the computer, but often the
- damage has already been done by the time that the Exception is
- trapped and reported. In this instance, you may find the computer
- locked regardless of what you choose. If the computer is indeed
- hung, you should write down the information on the screen and then
- reboot the machine.
-
- While QEMM's Exception reports can be cryptic to non-programmers
- -- or to programmers who have little experience with assembly
- language -- the information that they provide can sometimes be
- quite helpful. Exception reports can help you to identify which
- program has triggered the exception message, what the invalid
- instruction was, and the state of the processor's registers when
- the error occurred. Armed with this information, you may be able
- to help the developer of the offending application to determine
- the problem that led to the exception, and thus the developer may
- be able to provide a temporary workaround or a permanent fix.
-
- The exception report is divided into three parts --
-
- 1) The vector or class of exception, and its location and error
- code. The location of the exception indicates the address in
- memory at which the invalid instruction was attempted. The
- program loaded at this address (if indeed a program is loaded
- there) should be noted by running Manifest.
-
- Exception #13 at 1B12:0103, error code: 0000
-
- In this example, the program loaded at address 1B12:xxxx is
- automatically your suspect. Reboot your system in the same
- configuration as you had when the Exception #13 occurred. If the
- problem happened during an application program, don't load the
- application just yet. Load Manifest instead, and have a look at
- First Meg / Programs.
-
- Memory Area Size Description
- 03D1 - 0465 2.3K COMMAND
- 0466 - 046A 0.1K (04C0)
- 046B - 0483 0.4K COMMAND Environment
- 0484 - 0487 0.1K COMMAND Data
- 0488 - 0498 0.3K DV Environment
- 0499 - 04BE 0.6K DV
- 04BF - 1A38 85K DV Data
- 1A39 - 1A52 0.4K COMMAND Data
- 1A53 - 1AE7 2.3K COMMAND
- 1AE8 - 1B00 0.4K COMMAND Environment
- 1B01 - 7E4F 397K [Available]
-
- The sample Exception #13 above happened in that Available range,
- so it was the program that would have been loaded had we not
- loaded Manifest -- that is, the application program. If you have
- a TSR loaded low, and the Exception #13 is occurring within that
- TSR's address space, then it is your suspect, rather than the
- application. In any case, the program whose code falls into the
- range in which the Exception #13 occurred likely has a problem of
- some type.
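-
- The lookup described above can be sketched in a few lines. This
- is an illustration only: parse_report and owner_of are
- hypothetical helper names, the table is typed in by hand from
- the sample Manifest listing, and the code assumes that Manifest's
- Memory Area column is expressed in 16-byte paragraphs (which is
- consistent with the sizes shown).
-

```python
import re

# Matches the first line of the sample report in this note.
REPORT_RE = re.compile(
    r"Exception #(\d+) at ([0-9A-Fa-f]{4}):([0-9A-Fa-f]{4}), "
    r"error code: ([0-9A-Fa-f]{4})"
)

# (first paragraph, last paragraph, description), copied from the sample
# First Meg / Programs listing.
MANIFEST = [
    (0x03D1, 0x0465, "COMMAND"),
    (0x0466, 0x046A, "(04C0)"),
    (0x046B, 0x0483, "COMMAND Environment"),
    (0x0484, 0x0487, "COMMAND Data"),
    (0x0488, 0x0498, "DV Environment"),
    (0x0499, 0x04BE, "DV"),
    (0x04BF, 0x1A38, "DV Data"),
    (0x1A39, 0x1A52, "COMMAND Data"),
    (0x1A53, 0x1AE7, "COMMAND"),
    (0x1AE8, 0x1B00, "COMMAND Environment"),
    (0x1B01, 0x7E4F, "[Available]"),
]


def parse_report(line):
    """Return (vector, segment, offset, error_code) from a report line."""
    match = REPORT_RE.search(line)
    if match is None:
        raise ValueError("not an exception report line")
    vector = int(match.group(1))
    segment, offset, error_code = (int(g, 16) for g in match.groups()[1:])
    return vector, segment, offset, error_code


def owner_of(segment, offset, table):
    """Return the description of the range containing segment:offset."""
    # Fold the offset into the segment to get the paragraph number.
    paragraph = segment + (offset >> 4)
    for first, last, description in table:
        if first <= paragraph <= last:
            return description
    return None


if __name__ == "__main__":
    _, seg, off, _ = parse_report("Exception #13 at 1B12:0103, error code: 0000")
    print(owner_of(seg, off, MANIFEST))
```

- For the sample report, the address falls in the [Available]
- range, which matches the reading given above.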
-
- 2) The second part of the Exception #13 message is the register
- dump:
-
- AX=0000 BX=0000 CX=0000 DX=0000 SI=FFFF DI=0000 BP=0000
- DS=1B12 ES=1B12 SS=1B12 SP=FFFE Flags=7246
-
- The registers are the temporary storage areas on the 80386 chip
- which are used for calculations and addressing. Each register is
- two bytes (16 bits) in size, so each register is capable of
- holding a value from 0 to FFFF (hexadecimal), or from 0 to 65535
- (decimal).
-
- If any registers here are 0000 or FFFF, it's possible that you
- could be looking at a segment wrap. A segment wrap happens
- whenever a program attempts to access -- read from or write to --
- something beyond the limit of a segment. A word value consists of
- two adjacent bytes; if a word value were to begin at FFFF (which
- is the last byte of a segment), the second byte of that value would
- be outside the segment -- and an attempt to read from or write to
- that word would thus cause a protection violation. Similarly, a
- doubleword is four adjacent bytes; if any of the last three bytes
- are outside of the segment limit, a segment wrap and a protection
- violation will occur when an access is attempted.
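-
- The word and doubleword cases above come down to one piece of
- arithmetic. A minimal sketch, assuming the FFFFh limit described
- in this note; crosses_segment_limit is a hypothetical name used
- only for illustration.
-

```python
SEGMENT_LIMIT = 0xFFFF  # highest valid offset in a virtual 8086 mode segment


def crosses_segment_limit(offset: int, size: int) -> bool:
    """True if a `size`-byte access starting at `offset` runs past the
    last byte of the segment."""
    if not 0 <= offset <= SEGMENT_LIMIT:
        raise ValueError("offset must fit in 16 bits")
    if size < 1:
        raise ValueError("size must be at least one byte")
    # The last byte touched is offset + size - 1.
    return offset + size - 1 > SEGMENT_LIMIT


if __name__ == "__main__":
    for offset, size in ((0xFFFF, 1), (0xFFFF, 2), (0xFFFD, 4), (0xFFFC, 4)):
        print(f"{offset:04X} size {size}: {crosses_segment_limit(offset, size)}")
```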
-
- On an 8086 processor, it's actually possible for a segment wrap to
- occur without a protection violation, simply because the 8086 has
- no hardware protection at all. What is the byte after the last
- byte of a segment? On the 8086, it's the FIRST byte of the same
- segment. (Non-technical analogy for poker players: Queen - King -
- Ace - Two - Three is a straight in the penny-ante poker game
- played when the 8086 processor is dealing. The 386 processor is a
- very strict dealer, and does not permit this.) It is possible
- (though unlikely) for a program to continue without a crash on an
- 8086 processor when two "adjacent" bytes are actually a whole
- segment apart; it could theoretically be possible on a 386 too,
- but the exception is generated before the memory access can be
- completed.
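-
- To see what "a whole segment apart" means, here is a sketch of
- how an 8086 would resolve the bytes of such an access. It assumes
- the usual 8086 rules (physical address = segment * 16 + offset,
- with the offset wrapping at 16 bits and the result at 20 bits);
- byte_addresses_8086 is a hypothetical name.
-

```python
def byte_addresses_8086(segment: int, offset: int, size: int):
    """Physical addresses touched by a `size`-byte access on an 8086,
    where the offset silently wraps from FFFF back to 0000."""
    addresses = []
    for i in range(size):
        wrapped_offset = (offset + i) & 0xFFFF       # 16-bit offset wrap
        physical = ((segment << 4) + wrapped_offset) & 0xFFFFF  # 20-bit bus
        addresses.append(physical)
    return addresses


if __name__ == "__main__":
    first, second = byte_addresses_8086(0x1B12, 0xFFFF, 2)
    print(f"{first:05X} {second:05X}")
```

- For a word at 1B12:FFFF, the two "adjacent" bytes land at the
- very end and the very start of the same segment.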
-
- This sort of problem is seen most commonly during a string move --
- the program is copying a whole block of data from one range of
- addresses to another. You may not understand this, and actually
- it doesn't matter if you don't. Briefly, though, SI stands for
- Source Index; DI stands for Destination Index. These two registers
- are used for string instructions -- instructions that load or copy
- information sequentially. String instructions are extremely
- powerful and useful, since they allow the developer to deal with
- large amounts of data in a single pass. A byte or a word value
- can be fetched from memory by one string instruction, dealt with,
- and then the result can be copied to a new memory location with a
- second string instruction -- and all this can be managed with an
- extremely tight, fast loop. An entire range of addresses (for
- example, in screen memory) can even be filled with a given value
- using a single instruction. The catch here is that the string
- instruction is only valid as long as the value of the SI or DI
- register does not fall outside the range addressable by these
- registers. If either one of these tries to exceed FFFF (or tries
- to fall below 0000), as a string is being copied from one region
- of memory to another, you'll get a protection violation.
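-
- The catch can be illustrated with a deliberately simplified model
- of a forward, word-by-word string move. This is a sketch, not an
- emulation of the processor: it assumes the direction flag is
- clear (SI and DI count upward), ignores the data being moved, and
- uses the hypothetical names ProtectionFault and copy_words.
-

```python
class ProtectionFault(Exception):
    """Stands in for the Exception #13 that the 386 would generate."""


def copy_words(si: int, di: int, count: int):
    """Step SI and DI through `count` word moves; return the final (SI, DI).

    Raises ProtectionFault when a word would start at FFFF, because its
    second byte would lie beyond the segment limit.
    """
    for _ in range(count):
        for name, reg in (("SI", si), ("DI", di)):
            if reg + 1 > 0xFFFF:
                raise ProtectionFault(
                    f"{name}={reg:04X}: word access crosses the segment limit"
                )
        si = (si + 2) & 0xFFFF
        di = (di + 2) & 0xFFFF
    return si, di


if __name__ == "__main__":
    print(copy_words(0xFFFB, 0x0000, 2))
    try:
        copy_words(0xFFFF, 0x0000, 1)  # the register values in the sample dump
    except ProtectionFault as fault:
        print(fault)
```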
-
- 3) Instruction: A5 CC 00 00 00 00 00 00 00 00 00 00 00 00 00
- Do you want to (T)erminate the program or (R)eboot?
-
- This is the invalid instruction that the program was trying to
- execute when the processor stopped it. Since most humans don't
- have a hope of interpreting machine language by looking at the
- opcodes, you can get a better interpretation of what is going on
- by examining this instruction with a program that can render
- machine codes into assembly language. (Well... it's better than
- nothing.) To do so, go into DEBUG; type DEBUG at the DOS prompt.
-
- Enter the values from the Instruction line by typing
-
- E 100
-
- at DEBUG's hyphen prompt, and then entering each byte (pair of digits) from
- the instruction line. Follow each byte with a space.
-
- (As a bonus -- if you're running under DESQview, you can Mark the
- information from the Exception #13 report, and Transfer it into
- DEBUG running in a different Big DOS window.)
-
- If most of the bytes begin with a 4, 5, 6, or 7, there's a good
- chance that you're seeing a program trying to execute text,
- thinking that text to be code. This can happen in several
- circumstances, but frequent offenders are those programs which
- load code at the top of conventional memory during boot -- and
- therefore during the OPTIMIZE process -- and presume that no
- program will allocate that memory. Programs which place parts of
- themselves at the top of conventional memory typically do so
- without protecting themselves from programs like LOADHI which may
- need to allocate all conventional memory at appropriate times;
- LOADHI (and programs like it) will overwrite the vulnerable code.
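-
- The rule of thumb above (most bytes beginning with 4, 5, 6, or 7)
- can be checked mechanically. A rough sketch follows; looks_like_text
- is a hypothetical name, and treating "most" as "more than half" is
- an assumption made only for this example.
-

```python
def looks_like_text(instruction_line: str, threshold: float = 0.5) -> bool:
    """Guess whether the bytes on an Instruction line are text, not code.

    `instruction_line` is the row of two-digit hex bytes from the report,
    separated by spaces.
    """
    data = bytes.fromhex(instruction_line)
    if not data:
        return False
    # Bytes whose first hex digit is 4, 5, 6, or 7 fall in 40h-7Fh.
    texty = sum(1 for b in data if 0x40 <= b <= 0x7F)
    return texty / len(data) > threshold


if __name__ == "__main__":
    print(looks_like_text("4C 4F 41 44 48 49 53 49 47 4E 41 54 55 52 45"))
    print(looks_like_text("A5 CC 00 00 00 00 00 00 00 00 00 00 00 00 00"))
```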
-
- As a real-world example, PROTMAN, a program whose purpose in life
- is to manage the loading of various parts of 3Com and MS-LAN
- networks, did this in past versions, as explained in Quarterdeck
- Technical Note #173, PROTMAN.TEC. During the OPTIMIZE process,
- LOADHI would allocate all conventional memory while it was
- determining the size of the various drivers that were being
- loaded. PROTMAN would jump to what it thought was still its own
- code, but there would be LOADHI signatures there -- text -- and
- PROTMAN would crash.
-
- You can see the contents of this string if you Dump the bytes
- you just entered; use DEBUG's D command to do this.
-
- -d 100
-
- At the leftmost edge of your screen, you'll see a list of
- addresses. At the center and right of your screen, you'll see
- this:
-
- 4F 41 44 48 49 53 49 47-4E 41 54 55 52 BF 42 87 LOADHISIGNATUREB
- 98 FF 6F E2 E9 FF 00 00-26 21 F1 B3 34 00 AF 1D ..o.....&!..4...
- 01 00 D3 E0 0B E8 59 5F-07 B0 00 AA 5F 9D F8 C3 ......Y_...._...
- AA 41 FE 06 AD 90 C3 2E-C7 06 CF 88 00 00 2E 89 .A..............
-
- ASCII codes starting with 2 are generally punctuation marks; bytes
- 30-39 represent numeric digits; 3A-3F are punctuation, 41-5A are
- capital letters, 61-7A are small letters. Any instruction made up
- mostly of these numbers is almost certainly text -- and therefore
- not executable program code. The program that is trying to run
- such an instruction is doing so in error. When the bytes are NOT
- mostly in the 40-7F range, you should try to Unassemble them.
-
- -u 100
-
- 20C0:0100 A5 MOVSW
- 20C0:0101 CC INT 3
- 20C0:0102 0000 ADD [BX+SI],AL
-
- This is the killer instruction from the example Exception #13
- above. It's performing a MOVSW (MOVe String Word) at a point when
- the SI register is FFFF, and that means that it's trying to read
- a word value that begins at the last byte of a segment, which (as
- described above) is illegal.
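-
- The same reading of the sample report can be sketched in code:
- take the first byte of the Instruction line, see whether it is a
- word-sized string instruction, and check the index registers it
- uses. This is an illustration under stated assumptions, not a
- disassembler: it knows only three opcodes (A5, AB, AD, with
- Intel's mnemonics), ignores prefix bytes, and the names
- parse_registers and diagnose are hypothetical.
-

```python
import re

# Word-sized string opcodes and the index registers each one uses.
WORD_STRING_OPS = {
    0xA5: ("MOVSW", ("SI", "DI")),
    0xAB: ("STOSW", ("DI",)),
    0xAD: ("LODSW", ("SI",)),
}


def parse_registers(dump: str) -> dict:
    """Turn 'AX=0000 BX=0000 ...' into {'AX': 0, 'BX': 0, ...}."""
    pairs = re.findall(r"(\w+)=([0-9A-Fa-f]{4})", dump)
    return {name: int(value, 16) for name, value in pairs}


def diagnose(instruction_line: str, dump: str):
    """Return a one-line explanation for a likely segment wrap, else None."""
    opcode = bytes.fromhex(instruction_line)[0]
    if opcode not in WORD_STRING_OPS:
        return None
    mnemonic, index_regs = WORD_STRING_OPS[opcode]
    regs = parse_registers(dump)
    wrapped = [r for r in index_regs if regs.get(r) == 0xFFFF]
    if not wrapped:
        return None
    names = ", ".join(wrapped)
    return f"{mnemonic} with {names}=FFFF: word access crosses the segment limit"


if __name__ == "__main__":
    sample_dump = (
        "AX=0000 BX=0000 CX=0000 DX=0000 SI=FFFF DI=0000 BP=0000\n"
        "DS=1B12 ES=1B12 SS=1B12 SP=FFFE Flags=7246"
    )
    print(diagnose("A5 CC 00 00 00 00 00 00 00 00 00 00 00 00 00", sample_dump))
```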
-
- Other invalid instructions are harder for the non-programmers of
- the world to interpret. Often the first byte of an invalid
- instruction is 0F -- the first byte of a number of instructions
- that are valid in protected mode, but which the processor treats
- as invalid opcodes if the machine is in Virtual 86 mode.
- Exceptions of this kind showed up
- more commonly in the past, with programs that were trying to enter
- protected mode without calling the Virtual Control Program
- Interface. VCPI is an industry-standard way for protected-mode
- software to coexist with 386 expanded memory managers such as
- QEMM; all 386 memory managers these days are VCPI-providers, and
- almost all protected-mode programs are VCPI users (or "clients").
- Non-VCPI protected-mode programs include some memory- and
- hardware-diagnostic programs, and programs that use the DPMI
- memory management specification exclusively. Diagnostic programs
- typically recommend that you disable all memory-management
- software during diagnosis. DPMI programs will typically accept
- VCPI memory management; those rare programs that do not will
- simply refuse to start up under QEMM. In such cases, you may
- install QDPMI (the Quarterdeck DPMI Host) on your system; QDPMI is
- available on the Quarterdeck BBS at (310) 309-3227, CompuServe
- (!GO QUARTERDECK), or large local BBS systems.
-
- Q. How can an Exception #13 be fixed?
-
- Quarterdeck Technical Note #241, QEMM: General Troubleshooting
- (TROUBLE.TEC) is a good place to start. This note describes common
- problems and possible solutions, and will help if the cause of the
- Exception #13 is a memory conflict or bus-mastering issue.
-
- If you follow the instructions in TROUBLE.TEC completely, and the
- Exception #13 persists, the prospects for a resolution are bleak,
- since the problem is almost certainly a bug in the offending
- program. If this is so, unless you can alert the developer of the
- program (and make him or her understand all this, which might be
- another task altogether), you can never really make the problem go
- away, although sometimes you may be able to make it subside.
-
- Changing the location of the offending program in memory will
- sometimes help. If you're running under DESQview, and you're sure
- that you've given the program enough memory (i.e., all you can
- give it), try adding 16 to the size of the script buffer on page 2
- of Change a Program. If you're not running under DESQview, try
- adding an extra file handle or two. The key here is to change the
- location of the program in memory, which can occasionally be
- enough to provide temporary relief from the Exception.
-
- There is a substantial caveat: You're not fixing the problem by
- doing this; you're just making it submerge. There's still
- probably a bug in the offending program -- you've just changed it
- from a bomb to a landmine. If you can reproduce the problem
- consistently, you should still contact the publisher of the
- application with all of the data from the Exception message, and
- all of the data that you can supply about your system and its
- current configuration.
-
- With the exception (no pun intended) of the techniques mentioned
- above and in TROUBLE.TEC, non-programmers can do little to fix the
- root cause or even the symptoms of Exception reports. If you are
- unsuccessful in resolving a conflict, the information provided by
- the report should be forwarded, along with a Manifest printout and
- a complete description of your system, to the developer of the
- program that you were running at the time.
-
- ******************************************************************
- * Trademarks are property of their respective owners. *
- * This and other technical notes may be available in updated *
- * forms through Quarterdeck's standard support channels. *
- * Copyright (C) 1995 Quarterdeck Corporation *
- ******************** E N D O F F I L E ***********************
-
-
-